utf8proc.git
8 months agoci: bump actions/checkout from 4 to 5 (#300)
dependabot[bot] [Tue, 12 Aug 2025 16:13:32 +0000 (12:13 -0400)]
ci: bump actions/checkout from 4 to 5 (#300)

Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
9 months agolink other language bindings
Steven G. Johnson [Tue, 15 Jul 2025 21:26:15 +0000 (16:26 -0500)]
link other language bindings

9 months agoMerge pull request #296 from Techcable/fix/docs-issue-295
Erik Schnetter [Mon, 14 Jul 2025 16:45:57 +0000 (12:45 -0400)]
Merge pull request #296 from Techcable/fix/docs-issue-295

Fix documentation for `UTF8PROC_COMPOSE` (#295)

9 months agoFix documentation for UTF8PROC_COMPOSE (#295)
Techcable [Mon, 14 Jul 2025 13:49:31 +0000 (06:49 -0700)]
Fix documentation for UTF8PROC_COMPOSE (#295)

The documentation for the UTF8PROC_COMPOSE and UTF8PROC_DECOMPOSE options was the same.
Fix it.

9 months agoFix ASAN errors due to adding offset to nullptr (#240)
parmeet [Thu, 10 Jul 2025 20:53:52 +0000 (16:53 -0400)]
Fix ASAN errors due to adding offset to nullptr (#240)

* Fix ASAN errors due to adding offset to nullptr

* Revert "Fix ASAN errors due to adding offset to nullptr"

This reverts commit b933f42d51e6a8e278e6ce1932e6e027e07d5f51.

* make changes compact

10 months agoFix compilation in C90 mode (#284)
Thomas Koutcher [Fri, 20 Jun 2025 20:10:58 +0000 (22:10 +0200)]
Fix compilation in C90 mode (#284)

10 months agocheck max size of utf8proc_decompose_char buffer (#291)
Steven G. Johnson [Fri, 20 Jun 2025 20:10:14 +0000 (16:10 -0400)]
check max size of utf8proc_decompose_char buffer (#291)

* check max size of utf8proc_decompose_char buffer

* cmake rule for maxdecomposition test

* Update maxdecomposition.c

10 months agomark as alpha in NEWS
Steven G. Johnson [Fri, 20 Jun 2025 19:49:30 +0000 (15:49 -0400)]
mark as alpha in NEWS

10 months agomake release date TBD
Steven G. Johnson [Fri, 20 Jun 2025 19:47:39 +0000 (15:47 -0400)]
make release date TBD

10 months agoadd NEWS link
Steven G. Johnson [Fri, 20 Jun 2025 19:46:03 +0000 (15:46 -0400)]
add NEWS link

10 months agoPrepare release 2.11.0 (#293)
Erik Schnetter [Fri, 20 Jun 2025 19:45:40 +0000 (15:45 -0400)]
Prepare release 2.11.0 (#293)

* Prepare release 2.11.0

* Update soversion

* Update MANIFEST

10 months agoUpdate to UnicodeData 17.0.0 (#292)
Erik Schnetter [Wed, 18 Jun 2025 17:46:52 +0000 (13:46 -0400)]
Update to UnicodeData 17.0.0 (#292)

* Update to UnicodeData 17.0.0

* Tests: Handle grapheme strings containing NUL

* Tests: Remove left-over assert statement

* Correct build errors

* Correct build errors

* Update internal version numbers

* Update more version numbers

* Remove unwanted file

10 months agoMerge pull request #260 from kevinAlbs/add-cmake-config-file-package
Erik Schnetter [Sat, 7 Jun 2025 21:05:57 +0000 (17:05 -0400)]
Merge pull request #260 from kevinAlbs/add-cmake-config-file-package

Add CMake Config-file package

11 months agoremove test build directory from .gitignore
Kevin Albertson [Thu, 22 May 2025 00:21:31 +0000 (20:21 -0400)]
remove test build directory from .gitignore

11 months agofix comparison
Kevin Albertson [Tue, 20 May 2025 00:25:36 +0000 (20:25 -0400)]
fix comparison

11 months agouse `CMAKE_INSTALL_PREFIX` instead of `--prefix`
Kevin Albertson [Mon, 19 May 2025 23:58:58 +0000 (19:58 -0400)]
use `CMAKE_INSTALL_PREFIX` instead of `--prefix`

11 months agorevise Unix `PATH` setting
Kevin Albertson [Tue, 20 May 2025 00:04:29 +0000 (20:04 -0400)]
revise Unix `PATH` setting

Remove unneeded export.
Prepend to `PATH` so that unintended binaries are not picked up.

Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
11 months agouse `runner.os`
Kevin Albertson [Tue, 20 May 2025 00:03:39 +0000 (20:03 -0400)]
use `runner.os`

To allow future Windows additions to the test matrix.

Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
11 months agorevise Powershell PATH setting
Kevin Albertson [Tue, 20 May 2025 00:02:01 +0000 (20:02 -0400)]
revise Powershell PATH setting

Use capital `$Env` instead of `$env` to match conventions.
Prepend to `PATH` so that unintended binaries are not picked up.
Prefer `\` over `/` to match conventions.

Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
11 months agoMerge remote-tracking branch 'upstream/master' into add-cmake-config-file-package
Kevin Albertson [Sat, 17 May 2025 18:37:49 +0000 (14:37 -0400)]
Merge remote-tracking branch 'upstream/master' into add-cmake-config-file-package

11 months agotest consuming CMake package
Kevin Albertson [Sat, 17 May 2025 13:52:36 +0000 (09:52 -0400)]
test consuming CMake package

15 months agorm blank line
Steven G. Johnson [Tue, 31 Dec 2024 20:17:14 +0000 (15:17 -0500)]
rm blank line

15 months agoprepare for 2.10 release (#281)
Steven G. Johnson [Tue, 31 Dec 2024 20:15:04 +0000 (15:15 -0500)]
prepare for 2.10 release (#281)

* prepare for 2.10 release

* update NEWS links

15 months agosilence warnings: use int32_t for chars more consistently (#282)
Steven G. Johnson [Tue, 31 Dec 2024 19:58:17 +0000 (14:58 -0500)]
silence warnings: use int32_t for chars more consistently (#282)

16 months agoMerge pull request #277 from eschnett/eschnett/unicode16
Erik Schnetter [Sun, 29 Dec 2024 20:15:14 +0000 (15:15 -0500)]
Merge pull request #277 from eschnett/eschnett/unicode16

Redesign combining table

16 months agoOptimize table layout
Erik Schnetter [Sun, 29 Dec 2024 18:38:01 +0000 (13:38 -0500)]
Optimize table layout

16 months agoBump SOINDEX
Erik Schnetter [Sat, 28 Dec 2024 19:40:44 +0000 (14:40 -0500)]
Bump SOINDEX

16 months agoci: bump actions/checkout from 2 to 4 (#278)
dependabot[bot] [Sat, 28 Dec 2024 18:29:04 +0000 (13:29 -0500)]
ci: bump actions/checkout from 2 to 4 (#278)

Bumps [actions/checkout](https://github.com/actions/checkout) from 2 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v2...v4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
16 months agoci: bump actions/upload-artifact from 1 to 4 (#279)
dependabot[bot] [Sat, 28 Dec 2024 18:28:54 +0000 (13:28 -0500)]
ci: bump actions/upload-artifact from 1 to 4 (#279)

Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 1 to 4.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v1...v4)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
16 months agoUpdate minimum cmake version to 3.10 (#274)
dundargoc [Sat, 28 Dec 2024 18:28:37 +0000 (19:28 +0100)]
Update minimum cmake version to 3.10 (#274)

This is to prevent the following warning:

CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 3.10 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.

16 months agofuzz: improve code coverage (#273)
tyler92 [Sat, 28 Dec 2024 18:27:15 +0000 (20:27 +0200)]
fuzz: improve code coverage (#273)

16 months agoDescribe character combining table layout
Erik Schnetter [Sat, 28 Dec 2024 17:30:53 +0000 (12:30 -0500)]
Describe character combining table layout

16 months agoci: add dependabot to update actions (#275)
dundargoc [Sat, 28 Dec 2024 13:15:47 +0000 (14:15 +0100)]
ci: add dependabot to update actions (#275)

This will prevent CI failing from using deprecated actions.

16 months agoUpdate actions
Erik Schnetter [Fri, 20 Dec 2024 18:05:10 +0000 (13:05 -0500)]
Update actions

16 months agoutf8proc: Correct some types
Erik Schnetter [Thu, 19 Dec 2024 14:29:27 +0000 (09:29 -0500)]
utf8proc: Correct some types

16 months agoCorrect handling exclusions
Erik Schnetter [Wed, 18 Dec 2024 22:05:57 +0000 (17:05 -0500)]
Correct handling exclusions

16 months agoRedesign combining table
Erik Schnetter [Wed, 18 Dec 2024 20:55:35 +0000 (15:55 -0500)]
Redesign combining table

20 months agoupdate for Unicode 16.0.0
Steven G. Johnson [Fri, 30 Aug 2024 17:05:51 +0000 (13:05 -0400)]
update for Unicode 16.0.0

20 months agoproperties: add "ambiguous_width" property for ambiguous East Asian Width (#270)
bfredl [Fri, 30 Aug 2024 16:39:09 +0000 (18:39 +0200)]
properties: add "ambiguous_width" property for ambiguous East Asian Width (#270)

Some characters have their width defined as "Ambiguous" in UAX#11.
These are typically rendered as single-width by modern monospace fonts,
and utf8proc correctly returns charwidth==1 for these.

However some applications might need to support older CJK fonts where
characters which were two-byte in legacy encodings were rendered as
double-width. An example of this is the 'ambiwidth' option of vim
and neovim which supports rendering in terminals using such wideness
rules.

Add an 'ambiguous_width' property to utf8proc_property_t for such characters.
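
As a rough illustration of the distinction described above (not utf8proc's own API), the following Python sketch uses the standard library's `unicodedata.east_asian_width` to flag code points whose UAX#11 class is Ambiguous. The helper names `is_ambiguous_width` and `display_width` and the `ambiguous_is_wide` switch are hypothetical; the switch plays a role loosely similar to the 'ambiwidth' option mentioned in the commit message.

```python
import unicodedata


def is_ambiguous_width(ch: str) -> bool:
    """Return True if the code point's East Asian Width class is 'A' (Ambiguous)."""
    return unicodedata.east_asian_width(ch) == "A"


def display_width(ch: str, ambiguous_is_wide: bool = False) -> int:
    """Hypothetical width rule: Wide/Fullwidth -> 2, Ambiguous -> 1 or 2, else 1.

    Simplification: combining marks and control characters are not handled.
    """
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        return 2
    if eaw == "A" and ambiguous_is_wide:
        return 2
    return 1


if __name__ == "__main__":
    for ch in ("A", "\u00a1", "\u4e2d"):
        print(f"U+{ord(ch):04X}", is_ambiguous_width(ch),
              display_width(ch), display_width(ch, ambiguous_is_wide=True))
```

An application targeting legacy CJK terminals would pass `ambiguous_is_wide=True`; the default matches the single-width behaviour the commit message describes for modern monospace fonts.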

21 months agodocs: add examples for common usecases (#267)
dundargoc [Fri, 12 Jul 2024 02:01:19 +0000 (04:01 +0200)]
docs: add examples for common usecases (#267)

21 months agoUpdate README.md
Steven G. Johnson [Thu, 11 Jul 2024 13:16:17 +0000 (09:16 -0400)]
Update README.md

21 months agoci(macos): install julia dependency (#268)
dundargoc [Wed, 10 Jul 2024 11:48:32 +0000 (13:48 +0200)]
ci(macos): install julia dependency (#268)

Otherwise the job fails with the error message

"/bin/sh: julia: command not found"

22 months agorelocate `Using with CMake`
Kevin Albertson [Fri, 7 Jun 2024 16:57:37 +0000 (12:57 -0400)]
relocate `Using with CMake`

22 months agomake `Using with CMake` a subsection of `Quick Start`
Kevin Albertson [Fri, 7 Jun 2024 16:57:12 +0000 (12:57 -0400)]
make `Using with CMake` a subsection of `Quick Start`

22 months agochange `AnyNewerVersion` to `SameMajorVersion` for version compatibility
Kevin Albertson [Fri, 7 Jun 2024 16:56:22 +0000 (12:56 -0400)]
change `AnyNewerVersion` to `SameMajorVersion` for version compatibility

utf8proc appears to use SemVer. Do not consider different major versions compatible.
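
To make the difference between the two compatibility modes concrete, here is a small Python sketch of the rules as I understand them from CMake's `write_basic_package_version_file` documentation: `AnyNewerVersion` accepts any installed version that is not older than the requested one, while `SameMajorVersion` additionally requires the major component to match. The function names and the version strings in the usage lines (including "3.0.0") are hypothetical, and real CMake version matching has more cases than this sketch covers.

```python
def parse_version(text):
    """Split 'major.minor.patch' into a tuple of ints, zero-padded to 4 parts."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (4 - len(parts)))


def any_newer_version(installed, requested):
    """Compatible if the installed version is not older than the requested one."""
    return parse_version(installed) >= parse_version(requested)


def same_major_version(installed, requested):
    """Compatible if not older than requested AND the major versions are equal."""
    inst, req = parse_version(installed), parse_version(requested)
    return inst[0] == req[0] and inst >= req


if __name__ == "__main__":
    # A consumer asking for 2.9 against a hypothetical installed 3.0.0:
    print(any_newer_version("3.0.0", "2.9"))   # accepted under AnyNewerVersion
    print(same_major_version("3.0.0", "2.9"))  # rejected under SameMajorVersion
```

Under SemVer a major bump signals a breaking change, which is why the stricter rule is the safer default for consumers.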

22 months agofix whitespace
Kevin Albertson [Fri, 7 Jun 2024 16:48:24 +0000 (12:48 -0400)]
fix whitespace

Co-authored-by: Sutou Kouhei <kou@cozmixng.org>
2 years agobuild: include clangd files to .gitignore (#263)
dundargoc [Mon, 29 Apr 2024 17:33:59 +0000 (19:33 +0200)]
build: include clangd files to .gitignore (#263)

2 years agobuild: remove unnecessary policy check (#262)
dundargoc [Mon, 8 Apr 2024 17:46:39 +0000 (19:46 +0200)]
build: remove unnecessary policy check (#262)

Minimum version is 3.5 and policy CMP0048 was introduced in 3.0, meaning
that it will always be set to `NEW`.

2 years agoadd `Using with CMake` instructions
Kevin Albertson [Mon, 15 Jan 2024 15:03:18 +0000 (10:03 -0500)]
add `Using with CMake` instructions

2 years agoinstall package version file
Kevin Albertson [Sat, 13 Jan 2024 23:46:34 +0000 (18:46 -0500)]
install package version file

2 years agocreate and install package config file
Kevin Albertson [Sat, 13 Jan 2024 23:18:15 +0000 (18:18 -0500)]
create and install package config file

2 years agoexport and install targets
Kevin Albertson [Sat, 13 Jan 2024 23:08:11 +0000 (18:08 -0500)]
export and install targets

2 years agouse relative paths for install
Kevin Albertson [Sat, 13 Jan 2024 23:07:29 +0000 (18:07 -0500)]
use relative paths for install

This is intended to make the installed package relocatable.

2 years agoRemove ruby compat hacks (#259)
Claire Foster [Thu, 4 Jan 2024 18:01:49 +0000 (04:01 +1000)]
Remove ruby compat hacks (#259)

* Fix two minor bugs from the Ruby code

First, `category` rather than `code` was used in constructing the
`control_boundary` property as related to the characters U+200C and
U+200D. This seemed incorrect and should be fixed. This could be an
observable bugfix for any C code which inspects the `control_boundary`
property.

Second, when reading composition exclusions, Ruby's String hex method
produces zero rather than nil if no number is found. For example

    $ ruby -e 'puts "# blah".hex'
    0

This led to the character `'\0'` being included in the `exclusions`
and `excl_versions` sets which is incorrect. However this seems
asymptomatic because `'\0'` is never part of a composition. (In terms of
the C code, the use of `comp_exclusion` is guarded by the `comb_index`
property which is `UINT16_MAX` for `'\0'`.)

* Cleanup: Remove sequence ordering hack

This hack changed the ordering of sequences encoded in the sequences
table and was added so we could easily prove equivalence to the Ruby
data generator code.

However, it's no longer needed and removing it shouldn't result in any
functional change.
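
The composition-exclusions bug described above comes from a lenient hex parser turning a comment line into code point 0. A minimal Python sketch of a stricter parser follows; it is not the project's Julia generator, the function names are hypothetical, and the sample lines only imitate the general "hex code point, then `#` comment" layout of the exclusions data.

```python
def parse_exclusion_line(line):
    """Return the code point on a line, or None for blank or comment-only lines."""
    data = line.split("#", 1)[0].strip()  # drop everything after '#'
    if not data:
        return None  # nothing left: do NOT fall back to 0
    return int(data.split()[0], 16)


def read_exclusions(lines):
    """Collect the code points from an iterable of exclusion-file lines."""
    exclusions = set()
    for line in lines:
        code_point = parse_exclusion_line(line)
        if code_point is not None:
            exclusions.add(code_point)
    return exclusions


if __name__ == "__main__":
    sample = ["# blah", "", "0958    #  DEVANAGARI LETTER QA"]
    print(sorted(hex(cp) for cp in read_exclusions(sample)))
```

Returning `None` instead of a numeric default keeps `'\0'` out of the set, which is the behaviour the commit message says the Ruby `hex` call failed to provide.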

2 years agoUse stdint.h instead of inttypes.h (#223)
Michael Williamson [Thu, 4 Jan 2024 00:34:07 +0000 (00:34 +0000)]
Use stdint.h instead of inttypes.h (#223)

This improves support for targeting wasm32 with clang 12.

2 years agoPort ruby data_generator.rb to Julia (#258)
Claire Foster [Wed, 3 Jan 2024 17:02:08 +0000 (03:02 +1000)]
Port ruby data_generator.rb to Julia (#258)

* Port ruby data_generator.rb to Julia

This reduces the number of dependencies needed when regenerating the C
code. The new code also separates C code generation from unicode data
analysis somewhat more cleanly which should be better factored for
generating a Julia version of the data files in the future.

The output is identical to the original Ruby script, for now. Some bugs
which were found in the process are noted as FIXMEs in the Julia source
and can be fixed next.

* Replace some explicit loops with a utility function

* fixup! Port ruby data_generator.rb to Julia

* Update Makefile

* Update data/Makefile

* Update data/Makefile

* Update data/Makefile

* Update data/Makefile

* Update data/data_generator.jl

---------

Co-authored-by: Steven G. Johnson <stevenj@mit.edu>
2 years agoupgrade minimum cmake version (#255)
dundargoc [Sun, 26 Nov 2023 02:00:49 +0000 (03:00 +0100)]
upgrade minimum cmake version (#255)

This will silence the following warning:

CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.

2 years agoupdates for doxygen 1.9
Steven G. Johnson [Fri, 20 Oct 2023 22:38:02 +0000 (18:38 -0400)]
updates for doxygen 1.9

2 years agountar into new directory
Steven G. Johnson [Fri, 20 Oct 2023 21:16:42 +0000 (17:16 -0400)]
untar into new directory

2 years agomake distcheck should keep tarball, rm directory
Steven G. Johnson [Fri, 20 Oct 2023 21:15:11 +0000 (17:15 -0400)]
make distcheck should keep tarball, rm directory

2 years agoadd make distcheck
Steven G. Johnson [Fri, 20 Oct 2023 21:14:14 +0000 (17:14 -0400)]
add make distcheck

2 years agomake dist target
Steven G. Johnson [Fri, 20 Oct 2023 20:51:57 +0000 (16:51 -0400)]
make dist target

2 years agoversion 2.9 bump (#254)
Steven G. Johnson [Fri, 20 Oct 2023 20:42:25 +0000 (16:42 -0400)]
version 2.9 bump (#254)

2 years agoUnicode 15.1 support (#253)
Steven G. Johnson [Fri, 20 Oct 2023 20:24:59 +0000 (16:24 -0400)]
Unicode 15.1 support (#253)

* Unicode 15.1 support

* always update state

* fix GB9c logic

* print indic_conjunct_break in printproperty

* fix grapheme test

* update utf8proc_decompose_char docs

* more GB9c tests

3 years agov2.8.0 bump (#248)
Steven G. Johnson [Sun, 30 Oct 2022 21:24:01 +0000 (17:24 -0400)]
v2.8.0 bump (#248)

* version 2.8.0 bump

* NEWS link

3 years agounicode 15 support (#247)
Steven G. Johnson [Tue, 25 Oct 2022 03:18:17 +0000 (23:18 -0400)]
unicode 15 support (#247)

3 years agoAdd c flag when invoking ar (#241)
Harmen Stoppels [Tue, 25 Oct 2022 02:47:20 +0000 (04:47 +0200)]
Add c flag when invoking ar (#241)

`llvm-ar` warns when the archive does not exist and `c` is not passed.

3 years agoImprove fuzzer code coverage (#239)
Randy [Thu, 26 May 2022 12:58:54 +0000 (14:58 +0200)]
Improve fuzzer code coverage (#239)

* fuzz: test grapheme break functions

* fuzz: cover character lumping

3 years agofuzz: limit input length (#238)
Randy [Fri, 6 May 2022 01:49:11 +0000 (03:49 +0200)]
fuzz: limit input length (#238)

Longer inputs can lead to timeouts on oss-fuzz

4 years agodon't use make in cmake instructions (closes #236)
Steven G. Johnson [Sat, 16 Apr 2022 20:33:27 +0000 (16:33 -0400)]
don't use make in cmake instructions (closes #236)

4 years agoupdate Doxygen config with doxygen -u
Steven G. Johnson [Fri, 17 Dec 2021 02:14:53 +0000 (21:14 -0500)]
update Doxygen config with doxygen -u

4 years agocopyright year update
Steven G. Johnson [Fri, 17 Dec 2021 02:11:23 +0000 (21:11 -0500)]
copyright year update

4 years agoprepare for 2.7.0 release
Steven G. Johnson [Fri, 17 Dec 2021 02:10:08 +0000 (21:10 -0500)]
prepare for 2.7.0 release

4 years agoupdate for unicode 14 (#233)
Steven G. Johnson [Fri, 17 Dec 2021 02:08:37 +0000 (21:08 -0500)]
update for unicode 14 (#233)

4 years agorm travis
Steven G. Johnson [Fri, 17 Dec 2021 01:54:21 +0000 (20:54 -0500)]
rm travis

4 years agoupdate gitignore
Steven G. Johnson [Fri, 17 Dec 2021 01:53:02 +0000 (20:53 -0500)]
update gitignore

4 years ago[ci] set github CI (#229)
woclass [Fri, 17 Dec 2021 01:52:05 +0000 (19:52 -0600)]
[ci] set github CI (#229)

* [ci] set github CI: ubuntu, windows, macOS

* [ci] add make.yml

* [ci] Skip macOS check MANIFEST temporary

4 years agocmake: fix installation directories and also install pkgconfig file (#224)
Markus F.X.J. Oberhumer [Fri, 17 Dec 2021 01:42:22 +0000 (02:42 +0100)]
cmake: fix installation directories and also install pkgconfig file (#224)

4 years agoreduce lenencode bits (#232)
Benito van der Zander [Fri, 17 Dec 2021 01:30:27 +0000 (02:30 +0100)]
reduce lenencode bits (#232)

5 years agoGNUInstallDirs support (#159)
extrowerk [Thu, 15 Apr 2021 13:32:23 +0000 (15:32 +0200)]
GNUInstallDirs support (#159)

5 years agoOSS-Fuzz integration updates (#219)
Randy [Thu, 4 Feb 2021 17:59:39 +0000 (18:59 +0100)]
OSS-Fuzz integration updates (#219)

* fix build

* CIFuzz integration

* update fuzzer

* undo changes to build

* ossfuzz.sh: fix copy path

5 years agoOSS-Fuzz initial integration (#216)
Randy [Fri, 29 Jan 2021 18:54:58 +0000 (19:54 +0100)]
OSS-Fuzz initial integration (#216)

* add fuzz target

* update fuzzer

* add fuzzer to build with basic entry point

* add build script

* cleanup

* build fuzz target using cmake in oss-fuzz env

* ossfuzz.sh add newline

* update build

5 years agoFix Sign-Conversion warnings in library and test code (#214)
Mike Glorioso [Thu, 14 Jan 2021 17:59:49 +0000 (12:59 -0500)]
Fix Sign-Conversion warnings in library and test code (#214)

* JuliaStrings#169 turn on sign-conversion warnings

Signed-off-by: Mike Glorioso <mike.glorioso@gmail.com>
* JuliaStrings#169 fix sign-conversion warnings for utf8proc.c

fix sign-conversion warnings for utf8proc_iterate
uc requires at most 21 bits to identify a unicode codepoint, so there is no need for it to be unsigned
multiple locations use, modify, or store uc with a signed value
the only exception is line 137 where uc is compared with an unsigned value

fix sign-conversion warnings for utf8proc_tolower, utf8proc_toupper, utf8proc_totitle
all three methods have sign conversion warnings when calling seqindex_decode_index
seqindex_decode_index uses the passed value as an index to an array utf8proc_sequences
as utf8proc_sequences is hard-coded and smaller than 2^31 - 1 we can safely cast to unsigned

fix sign-conversion warnings for utf8proc_decompose_char
lines with this warning use the defined function utf8proc_decompose_lump
in the function, a hardcoded unsigned value (1<<12) is complemented then cast as a signed value
as the intent is to remove the 12th bit flag from options, a signed value, and explicit cast is safe

fix sign-conversion warnings for utf8proc_map_custom
result is declared as signed, but is only expected to contain values between 0 and 4
sizeof returns an unsigned value. result must be cast to unsigned

Signed-off-by: Mike Glorioso <mike.glorioso@gmail.com>
* JuliaStrings#169 fix sign-conversion warnings for test/*

fix sign-conversion warnings for test/tests.c encode
change type for d to match return value of utf8proc_encode_char

fix sign-conversion warnings for test/graphemetest.c checkline
si, i, and j are unsigned size types, utf8proc_map and utf8proc_iterate accept and return signed size types
utf8proc_map treats negative strlen values as 0. the strlen used by the test must be similarly limited
utf8proc_iterate treats negative strlen values as 4 which will be less than the unsigned size
fix unused-but-set-variable warning by checking the glen value

fix sign-conversion warnings for test/case.c main
the if block ensures that tested codepoint fits in wint_t, but needs to include u and l as well
c, u, and l can be safely cast to wint_t

fix sign-conversion warnings for test/iterate.c
all values used for len are below 8, so an explicit cast is safe
updated types for more portable test code

fix sign-conversion warnings for test/printproperty.c main
change type of c to signed to resolve all sign-conversion warnings.
replace sscanf(... &c) with sscanf(... &x) followed by explicit sign conversion

Signed-off-by: Mike Glorioso <mike.glorioso@gmail.com>
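
Two of the arithmetic claims in the message above can be checked directly. The Python sketch below is an illustration only, not code from utf8proc: it confirms that the largest Unicode code point needs 21 bits, so it fits comfortably in a signed 32-bit value, and it shows the "complement a flag, then AND" idiom used to clear a single bit such as `1<<12`. The names `MAX_CODE_POINT`, `fits_in_int32`, and `clear_flag` are hypothetical.

```python
MAX_CODE_POINT = 0x10FFFF  # highest valid Unicode code point
INT32_MAX = 2**31 - 1      # largest value of a signed 32-bit integer


def fits_in_int32(value):
    """True if a non-negative value is representable as a signed 32-bit int."""
    return 0 <= value <= INT32_MAX


def clear_flag(options, flag):
    """Remove one flag bit from an options word via complement-and-AND."""
    return options & ~flag


if __name__ == "__main__":
    print(MAX_CODE_POINT.bit_length())    # number of bits needed
    print(fits_in_int32(MAX_CODE_POINT))  # a signed type is wide enough
    options = (1 << 12) | 0b101           # hypothetical options word
    print(bin(clear_flag(options, 1 << 12)))
```

In C the complement of an unsigned constant has to be converted back to the signed type of the options word, which is where the explicit cast mentioned in the commit message comes in; Python's unbounded integers hide that step.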
5 years agodownload test data to build directory (fixes #212)
Steven G. Johnson [Sat, 19 Dec 2020 18:08:34 +0000 (13:08 -0500)]
download test data to build directory (fixes #212)

5 years agoensure ruby is in UTF-8 mode (#209)
Steven G. Johnson [Thu, 17 Dec 2020 23:36:28 +0000 (18:36 -0500)]
ensure ruby is in UTF-8 mode (#209)

* ensure ruby is in UTF-8 mode

* Revert "ensure ruby is in UTF-8 mode"

This reverts commit 587b7b6b7215f91b1ae52aefc82d359f2f378a61.

* ensure Ruby reads files in UTF-8 encoding

5 years agofix manifest
Steven G. Johnson [Tue, 15 Dec 2020 21:36:45 +0000 (16:36 -0500)]
fix manifest

5 years ago2.6.1 version bump
Steven G. Johnson [Tue, 15 Dec 2020 20:29:32 +0000 (15:29 -0500)]
2.6.1 version bump

5 years agofix NULL args in grapheme_break_stateful
Steven G. Johnson [Tue, 15 Dec 2020 20:26:56 +0000 (15:26 -0500)]
fix NULL args in grapheme_break_stateful

5 years agoupdate doxygen config with doxygen -u
Steven G. Johnson [Mon, 23 Nov 2020 19:21:26 +0000 (14:21 -0500)]
update doxygen config with doxygen -u

5 years agobump to version 2.6
Steven G. Johnson [Mon, 23 Nov 2020 19:18:43 +0000 (14:18 -0500)]
bump to version 2.6

5 years agoFix grapheme breaks on string-initial (#205)
Steven G. Johnson [Mon, 23 Nov 2020 19:10:29 +0000 (14:10 -0500)]
Fix grapheme breaks on string-initial (#205)

* Fix extended emoji + zwj combo

* Patch initial repeated regional flags and extended+zwj emoji

* Merge conditions for setting breaks bt region

* updated fix

* perform tests for both utf8proc_map and manual calls to utf8proc_grapheme_break_stateful

* consolidate tests

Co-authored-by: Thomas Marks <marksta@umich.edu>
5 years agodocs: fix simple typo, encounted -> encountered (#201)
Tim Gates [Fri, 9 Oct 2020 12:30:50 +0000 (23:30 +1100)]
docs: fix simple typo, encounted -> encountered (#201)

There is a small typo in utf8proc.h.

Should read `encountered` rather than `encounted`.

5 years agoadd islower/isupper functions (#196)
Steven G. Johnson [Tue, 25 Aug 2020 20:42:59 +0000 (16:42 -0400)]
add islower/isupper functions (#196)

* add islower/isupper functions

* added test

* more tests + bugfix

* Makefile fix

* rm iscase test on make clean

5 years agoSwitch to HTTPS for referencing `www.unicode.org`. (#193)
xkszltl [Mon, 25 May 2020 14:20:08 +0000 (22:20 +0800)]
Switch to HTTPS for referencing `unicode.org`. (#193)

Resolve https://github.com/JuliaStrings/utf8proc/issues/192

6 years agoUnify include file handling (#190)
Stefan Floeren [Mon, 13 Apr 2020 14:59:30 +0000 (16:59 +0200)]
Unify include file handling (#190)

The cmake file expects the parent folder to be named "utf8proc",
otherwise the target_include_directories won't work, as it references
an unknown path.

This deviates from the install targets (both cmake and makefile) in
putting the include file into a subfolder in contrast to the top level
folder. This also prevents using the library with the recent cmake
addition of FetchContent.

This change unifies the include file handling by using the local path
for cmake as well.

This might break existing uses. As a workaround, we could add a dummy
include file in the old location (new utf8proc subfolder). I'm not sure
if that is necessary.

Co-authored-by: Stefan Floeren <stefan-floeren@users.noreply.github.com>
6 years agoFix memory leaks in tests case.c and misc.c (#189)
Andreas-Schniertshauer [Mon, 30 Mar 2020 11:51:44 +0000 (13:51 +0200)]
Fix memory leaks in tests case.c and misc.c (#189)

* Add: tests to CMakeLists.txt

* Disable compilation of charwidth, graphemetest and normtest because of missing getline

* Refactoring: UTF8PROC_ENABLE_TESTING default Off, move tests that don't compile on windows to NOT MSVC section, add testing to appveyor.yml

* Add: testing to travis

* Changed: flag to WIN32 because MinGW has the same problem as MSVC

* Commented out graphemetest and normtest because they fail.

* Re-added: graphemetest and normtest added missing data to the path of the text files.

* Fix: last commit was partly wrong; normtest failed.

* Commented out graphemetest and normtest because they fail, because in CMakeLists is missing building of data.

* Add: mingw_static, mingw_shared, msvc_shared, msvc_static to ignore list

* Add: prefix utf8proc. to tests

* Fix: memory leaks in tests case.c and misc.c forgot to call free after calling utf8proc_NFKC_Casefold

Co-authored-by: Andreas-Schniertshauer <Andreas-Schniertshauer@users.noreply.github.com>
6 years agoRevert "disable tests under mingw" (#187)
Steven G. Johnson [Sun, 29 Mar 2020 14:48:42 +0000 (10:48 -0400)]
Revert "disable tests under mingw" (#187)

This reverts commit 7e834d77024d770875559d853b09b8bb7f9321a1.

6 years agouse unsigned char more consistently, silence -Wextra compiler warnings (#188)
Steven G. Johnson [Sun, 29 Mar 2020 14:44:42 +0000 (10:44 -0400)]
use unsigned char more consistently, silence -Wextra compiler warnings (#188)

6 years agofixes
Steven G. Johnson [Sun, 29 Mar 2020 13:35:32 +0000 (09:35 -0400)]
fixes

6 years agoadd build to gitignore, make paths absolute (closes #185)
Steven G. Johnson [Sun, 29 Mar 2020 13:01:04 +0000 (09:01 -0400)]
add build to gitignore, make paths absolute (closes #185)